R: barplots showing the relative strengths of correlation
What do you want to do?
Create a plot like the one in the image below. There is one dependent variable (e.g., essay score) that you would like to predict from a series of linguistic indices. The plot shows each index's correlation with that dependent variable, not pairwise correlations among all the variables.
Change the color and the shape of the point according to the overall construct, or index category
Print the correlation coefficients
Show the error bars that reflect Confidence Intervals
https://gyazo.com/5ad8b1d3a615155ffe5cb7d632155524
R packages used:
#ggplot2
#psych
First, define the full variable names and the index categories. The variable names are used as labels in the plot, and the index category determines the color and the point shape. Both must be listed in the same order as the predictor columns in the data.
code::R
`{r}
varnames <- rbind("McD CD", "USF CD", "Verb—Nsubject DeltaP Dep", "Verb—Dobj (MI)", "Verb—Advmod Delta P Strongest", "Noun—Amod (MI)", "Adjective Frequency (Logged)", "Adverb Frequency (Logged)", "Main verb Frequency (Logged)", "Content word lemma Frequency", "Lemma bigram DeltaP Strongest")
IndexCategory <- rbind("Contextual distinctiveness", "Contextual distinctiveness", "Dependency bigram", "Dependency bigram", "Dependency bigram", "Dependency bigram", "Word Frequency", "Word Frequency", "Word Frequency", "Word Frequency", "Bigram")
`
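As a small sketch of the same idea, the labels can also be kept in a single data frame so that each name stays paired with its category. The three rows below are copied from the lists above; the object names `index_labels` and `category_counts` are assumptions made for illustration and do not appear in the original code.

```r
# Sketch only: `index_labels` is a hypothetical name, not part of the original code.
index_labels <- data.frame(
  varnames = c("McD CD", "USF CD", "Adverb Frequency (Logged)"),
  IndexCategory = c("Contextual distinctiveness", "Contextual distinctiveness", "Word Frequency"),
  stringsAsFactors = FALSE
)

# Each label should appear only once, and every label needs a category.
stopifnot(!anyDuplicated(index_labels$varnames))
stopifnot(!anyNA(index_labels$IndexCategory))

# Count how many indices fall into each category.
category_counts <- table(index_labels$IndexCategory)
print(category_counts)
```

Keeping the two columns side by side makes a mismatch between a name and its category easier to spot than in two separate lists.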
Next, run the correlation analysis with #psych and store the result in a new data frame.
code::R
`{r}
library(psych)
correlation <- corr.test(x = a2[, 2:12], y = a2$Score, method = "pearson") # x = independent variables (columns 2 to 12), y = dependent variable
cor_result <- cbind(varnames, IndexCategory, print(correlation, short = FALSE, digits = 3)) # combine the variable names, the index categories, and the correlation table (with confidence intervals) returned by print()
cor_result$cor_abs <- abs(cor_result$raw.r) # absolute strength of each correlation, used later to sort the variables
`
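To make the columns `raw.lower`, `raw.r`, and `raw.upper` more concrete, here is a minimal base R sketch of a single correlation with its 95% confidence interval. It uses simulated data (not the `a2` data set) and `cor.test()` rather than #psych, and it assumes the usual normal-theory interval based on the Fisher z-transform; the variable names are made up for illustration.

```r
set.seed(1)                              # simulated data, for illustration only
score <- rnorm(100)                      # stands in for the dependent variable
index <- 0.5 * score + rnorm(100)        # stands in for one linguistic index

ct <- cor.test(index, score, method = "pearson", conf.level = 0.95)
r  <- unname(ct$estimate)                # point estimate of the correlation

# The same interval by hand: Fisher z-transform with a normal-theory standard error
z  <- atanh(r)
se <- 1 / sqrt(length(score) - 3)
ci_manual <- tanh(z + c(-1, 1) * qnorm(0.975) * se)

# Compare the manual interval with the one reported by cor.test()
same_interval <- isTRUE(all.equal(ci_manual, as.numeric(ct$conf.int)))
print(same_interval)
```

The interval is not symmetric around `r` because it is computed on the z scale and then transformed back, which is why the plot needs separate lower and upper columns.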
Finally, plot them using #ggplot2
code::R
`{r correlation plots}
library(ggplot2)
ggplot(cor_result, aes(x = reorder(varnames, cor_abs), y = raw.r, fill = IndexCategory, shape = IndexCategory)) + # reorder() sorts the variables by absolute correlation; y = point estimate of the correlation
geom_bar(stat = "identity", width = .7) + # draw the bars
geom_pointrange(aes(ymin = raw.lower, ymax = raw.upper)) + # point = estimate, range = confidence interval of the correlation
ylim(-1, 1) + # limit the axis to the possible range of a correlation (-1 to 1)
geom_text(aes(y = -.79, label = raw.r, size = 1.7), hjust = "outward", family = "serif") + # print the point estimates of the correlation coefficients
coord_flip() + # make the plot horizontal
theme_bw() + # use a white background
labs(x = "Lexical and phraseological indices", y = "Pearson Correlation Coefficients") + # name the axes
theme(legend.position = "bottom") + # put the legend below the plot
scale_size(guide = "none") # hide the size legend created by geom_text()
`
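The sorting step can be checked without drawing anything. The sketch below uses made-up correlation values (not results from the analysis above) to show how `reorder()` orders the labels by `cor_abs`; `toy`, `ordered_names`, and `plot_order` are hypothetical names.

```r
# Made-up correlations for illustration only; these are not results from the source.
toy <- data.frame(
  varnames = c("Index A", "Index B", "Index C"),
  raw.r    = c(0.45, -0.30, 0.25)
)
toy$cor_abs <- abs(toy$raw.r)

# reorder() returns a factor whose levels are sorted by cor_abs, in ascending order.
ordered_names <- reorder(toy$varnames, toy$cor_abs)
plot_order <- levels(ordered_names)
print(plot_order)
```

The first level is drawn closest to the origin, so after `coord_flip()` the weakest correlation sits at the bottom and the strongest at the top. Sorting on `raw.r` instead would put the negative correlation first regardless of its strength.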